Sparse Linear Identifiable Multivariate Modeling

Authors

  • Ricardo Henao
  • Ole Winther
Abstract

In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully Bayesian hierarchy for sparse models using slab and spike priors (two-component δ-function and continuous mixtures), non-Gaussian latent factors and a stochastic search over the ordering of the variables. The framework, which we call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and benchmarked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs as well as or better than LiNGAM with comparable computational complexity. We attribute this mainly to the stochastic search strategy used, and to parsimony (sparsity and identifiability), which is an explicit part of the model. We propose two extensions to the basic i.i.d. linear framework: non-linear dependence on observed variables, called SNIM (Sparse Non-linear Identifiable Multivariate modeling), and correlations between latent variables, called CSLIM (Correlated SLIM), for temporal and/or spatial data. The source code and scripts are available from http://cogsys.imm.dtu.dk/slim/.
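To make the modeling ingredients named in the abstract concrete, the following is a minimal sketch of the generative side only: a loading matrix drawn from a two-component δ-function slab and spike prior, combined with non-Gaussian latent factors. It is not the authors' inference procedure or their released code. The Gaussian slab, the Laplace factor distribution, and every function and parameter name (`sample_spike_slab_loadings`, `simulate_sparse_factor_data`, `inclusion_prob`, `slab_scale`, `noise_scale`) are assumptions chosen for illustration.

```python
import numpy as np


def sample_spike_slab_loadings(d, m, inclusion_prob, slab_scale, rng):
    """Draw a d x m loading matrix from a delta-spike / Gaussian-slab prior.

    Each entry is exactly zero (the spike) with probability
    1 - inclusion_prob, and Gaussian (the slab, an assumed choice) otherwise.
    """
    mask = rng.random((d, m)) < inclusion_prob
    slab = rng.normal(0.0, slab_scale, size=(d, m))
    return np.where(mask, slab, 0.0)


def simulate_sparse_factor_data(n, d, m, inclusion_prob=0.3,
                                slab_scale=1.0, noise_scale=0.1, seed=0):
    """Simulate X = A Z + noise with sparse A and non-Gaussian factors Z."""
    rng = np.random.default_rng(seed)
    A = sample_spike_slab_loadings(d, m, inclusion_prob, slab_scale, rng)
    # Laplace factors are one heavy-tailed, non-Gaussian choice (assumed here).
    Z = rng.laplace(0.0, 1.0, size=(m, n))
    noise = rng.normal(0.0, noise_scale, size=(d, n))
    X = A @ Z + noise
    return X, A, Z


if __name__ == "__main__":
    X, A, Z = simulate_sparse_factor_data(n=200, d=6, m=3)
    print("data shape:", X.shape)
    print("fraction of zero loadings:", np.mean(A == 0.0))
```

In a simulation like this, the exact zeros in `A` are what carry the sparsity, and the non-Gaussian factors are the ingredient the abstract ties to identifiability. Recovering `A` from `X` would still require an inference scheme such as the one the paper proposes.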


Similar resources

Sparse Linear Dynamical System with Its Application in Multivariate Clinical Time Series

Linear Dynamical System (LDS) is an elegant mathematical framework for modeling and learning multivariate time series. However, in general, it is difficult to set the dimension of its hidden state space. A small number of hidden states may not be able to model the complexities of a time series, while a large number of hidden states can lead to overfitting. In this paper, we study methods that i...


Modeling of temperature in friction stir welding of duplex stainless steel using multivariate lagrangian methods, linear extrapolation and multiple linear regression

In this study, the temperature in friction stir welding of duplex stainless steel has been investigated. First, the temperature was modeled and estimated at different distances from the center of the stir zone using the multivariate Lagrangian function. Then, the linear extrapolation method and the multiple linear regression method were used to estimate the temperature outside the range and ...


Optimizations for Tensorial Bernstein-Based Solvers by Using Polyhedral Bounds

The tensorial Bernstein basis for multivariate polynomials in n variables has 3^n functions for degree 2. Consequently, computing the representation of a multivariate polynomial in the tensorial Bernstein basis takes exponential time, which makes tensorial Bernstein-based solvers impractical for systems with more than n = 6 or 7 variables. This article describes a polytope (...


Exploiting Hidden Persistent Structures in Multivariate Tensor-Based Morphometry and Its Application to Detecting White Matter Abnormality in Maltreated Children

We present novel multivariate tensor-based morphometry (TBM) for characterizing white matter abnormalities. Traditionally TBM is used in quantifying tissue volume changes in a massive univariate fashion. At each voxel, the Jacobian determinant obtained from TBM is used as the response variable in a general linear model (GLM) and a test statistic is constructed. However, this obvious approach ca...


Journal:
  • Journal of Machine Learning Research

Volume 12

Publication date: 2011